A. Field of the Invention

The present invention relates generally to loyalty and retention programs and,
more particularly, to the use of data modeling to guide such programs.



B. Description of the Prior Art 

Businesses today focus efforts on both generating new customers and retaining
existing customers. Typically, companies tend to look only at when a customer's contract expires
to engage in retention efforts, and they apply a standard marketing strategy for all customers
without taking into account the customer's previous history with the company. By using this
approach for customer retention, a company often wastes time with a customer who generates
little revenue for the company, while not spending enough time trying to keep a customer who is
expected to generate a lot of future revenue for the company. Further, a company may apply the
same incentives to both categories of customers.

Another problem with this retention technique is that contract expiration is not the only
time that a company should solicit the renewal of a contract. Sometimes it is better to contact the
customer after a contract has been renewed, or further, not to contact the customer at all.

Some companies compare data models with new customer information to make
determinations about how to approach certain aspects of a customer's account, including how
long a customer will stay with the company, also known as survival. Some statistical
approaches for the analysis of survival data deal with noting observations about "hazard
functions" where the parameters for each technique vary. A hazard function is a formula
representing the probability of a customer's termination of service based on previous behavior
derived from a stored data set. Fig. 1 graphically represents an example of a hazard function.
Hazard values, which represent the probability of a customer churning, or terminating the
contract, are plotted along the y-axis. The age of the customer is plotted along the x-axis. The
graph displays a spike at 12 months which demonstrates that this customer is more likely to cease
business dealings with the company at 12 months, which coincides with the termination of that
customer's contract.

Parametric survival models estimate the effects of covariates (subject variables whose
values influence dependent variables) by presuming a lifetime distribution of a known form,
such as an exponential or Weibull. Although such models are popular for some applications,
including accelerated failure models, the smoothness of these postulated distributions makes
them inappropriate for survival data with contract expiration dates that provide natural spikes in
the models for such data.

In contrast, the Kaplan-Meier method is "non-parametric" and provides hazard and
survival functions with no assumption of a parametric lifetime distribution function. However, it
is difficult to use this method to estimate the effects of covariates on the hazard and survival
functions. Subsets of customers can generate separate Kaplan-Meier estimates, but sample size
considerations generally require substantial aggregation in the data so that many customers are
assigned the same hazard and survival functions regardless of their variation on many potential
covariates.

The proportional hazard method assumes that each customer's hazard function is a
multiple of a single baseline hazard. This multiplier uses functions that are generally linear,
which tends to assign extreme values to subjects with extreme covariate values. Further, the
presumption of the proportional hazards is restrictive in that there may not be a single baseline
hazard for each subject, and the form of that baseline's variation may not be well modeled by the
time-dependent covariates or stratification that are the traditional statistical extensions of the
original proportional hazard model.

Companies also use neural networks for predictive modeling and modeling relationships
whose form is unknown. Because neural networks are non-linear, universal function
approximators, they overcome the proportionality and linearity constraints imposed by the
statistical approaches for modeling survival data. Neural networks have previously been used to
predict actual tenure of a customer, but the information generated by these neural networks falls
short when used to focus marketing techniques, as the information only speaks to the actual tenure
of the customer, not the probable future tenure of the customer.

In addition to the models stated above, companies use the concept of customer lifetime
value in order to value customers. Customer lifetime value (LTV) is the measure of the profit
generating potential, or value, of a customer and is a useful tool in evaluating the high-value
customers. LTV is composed of two independent components, value and tenure. Value
incorporates the concepts of account revenue and fixed and variable costs. It is important in the
prediction of LTV to incorporate the estimated differentiated tenures for every customer with a
given service supplier, based on the usage, revenue, and sales profiles contained in company
databases. Tenure prediction models generate, for a given customer, i, a hazard curve or a hazard
function that indicates the probability $h_i(t)$ of cancellation at a given time, t, in the future. A
hazard curve can be converted to a survival curve or a survival function, which plots the
probability $S_i(t)$ of survival, or non-cancellation, at any time, t, given that customer, i, was active
at time (t-1), i.e., $S_i(t) = S_i(t-1) \times [1 - h_i(t)]$ with $S_i(1) = 1$. As such, the
$LTV = \sum_{t=1}^{T} S_i(t) \times v_i(t)$, where $v_i(t)$ is the expected value of customer, i, at time, t,
and T is the maximum time period under consideration. This LTV calculates customer specific
estimates of total expected future profit based on customer behavior and usage patterns.
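
By way of illustration only, the conversion from a hazard curve to a survival curve and the LTV sum described above might be sketched in Python as follows. The function names and the list-based data layout are hypothetical and are not part of the described system; the sketch simply applies the recursion with S(1) = 1, so the first hazard value does not enter the survival curve.

```python
def survival_from_hazard(hazard):
    """Convert a hazard curve to a survival curve.

    hazard[k] holds h_i(t) for t = k + 1 (illustrative layout, an assumption).
    Applies S_i(t) = S_i(t-1) * [1 - h_i(t)] with S_i(1) = 1.
    """
    survival = [1.0]
    for k in range(1, len(hazard)):
        survival.append(survival[-1] * (1.0 - hazard[k]))
    return survival


def lifetime_value(hazard, value):
    """Sum S_i(t) * v_i(t) over t = 1..T, where value[k] holds v_i(t) for t = k + 1."""
    survival = survival_from_hazard(hazard)
    return sum(s * v for s, v in zip(survival, value))


# Three periods, constant expected value of 10 per period.
example_ltv = lifetime_value([0.0, 0.5, 0.5], [10.0, 10.0, 10.0])
```

In this small example the survival curve is 1, 0.5, 0.25, so the sum is 10 + 5 + 2.5.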

One of the shortcomings of using valuation by revenue and LTV is that customer
valuations by revenue and LTV ignore the potential effects of company actions, particularly
retention and service actions.

It is therefore desirable to provide techniques that overcome the limitations of existing
methods to calculate probabilities of tenure of a customer to focus marketing techniques.



SUMMARY OF THE INVENTION 

To overcome the limitations of existing techniques to model data, and in accordance with
the purpose of the invention, as embodied and broadly described herein, methods and systems
consistent with the invention include a method for evaluating customer value to guide loyalty
and retention programs including calculating an individual customer's tenure based on attributes
relating to a plurality of current customer accounts; generating a hazard function for each of a
plurality of new customers to determine probability of churn based on the individual customer's
tenure; calculating a gain in lifetime value for each of the plurality of new customers; and using
at least one of the hazard function and gain in lifetime value for each of the plurality of new
customers to focus loyalty and retention programs.

It is to be understood that both the foregoing general description and the following
detailed description are exemplary and explanatory only and are not restrictive of the invention,
as claimed. Advantages of the invention will be set forth in part in the description which
follows, and in part will be obvious from the description, or may be learned by practice of the
invention. The objects and advantages of the invention will be realized and attained by means of
the elements and combinations particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this
specification, illustrate an implementation of the invention and, together with the description,
explain the goals, advantages and principles of the invention. In the drawings,

Fig. 1 is a graphic representation of a standard hazard function;

Fig. 1a is a block diagram of a system in which methods consistent with the present
invention may be implemented;

Fig. 2 is a schematic diagram of an example of a neural network;

Fig. 2a is a flowchart representing a method of training a neural network in a manner
consistent with the present invention;

Fig. 3 is a flowchart representing a method of generating a hazard function for a particular
customer in a manner consistent with the present invention;

Fig. 4 is a set of graphic representations of four types of clustering of hazard functions generated
by a neural network in a manner consistent with the present invention; and

Fig. 5 is a graphic representation of a gain in lifetime value generated in a manner
consistent with the present invention.

DETAILED DESCRIPTION

Reference will now be made in detail to an implementation consistent with the present
invention. Wherever possible, the same reference numbers will be used throughout the drawings
to refer to the same or like parts.

Methods and systems consistent with the present invention provide for a neural network
which is generated using functions and attributes from customer account information and is used
to generate hazard functions from which expected tenure of a customer can be determined and
marketing techniques can be focused. The analysis of these hazard functions can direct
marketing and retention efforts and can be explained with the example of cellular telephone
service. The present invention can relate to any business and the reference to cellular telephone
service is merely used as an example.

Fig. 1a shows an example of a system in which the present invention may be
implemented. The system comprises an input device 2, a display 4 and a computer 6 which
includes a memory 8 and a central processing unit 9.

The system uses a multilayer feed-forward neural network which is useful in modeling
relationships whose form is unknown because it is a non-linear, universal function approximator.
It can overcome the proportionality and linearity constraints imposed by traditional survival
analysis techniques and provide more accurate hazard and tenure models. With standard neural
network fitting, a target is set and the network is fitted to predict that target as a complicated
nonlinear function of covariates. A random sample of responses to a particular marketing
campaign, for instance, might furnish targets in the form of 1's and 0's based on whether a
customer responded to the campaign.

The customer data for modeling has, in addition to a variety of independent input
attributes, two important attributes: tenure and a censoring flag. TENMON is defined as the
customer tenure in months and a CHURN flag indicates if the customer is still active or has
terminated service. If CHURN=0, the customer is still active and TENMON indicates the
number of months the customer has had service; if CHURN=1, the customer has cancelled and
TENMON is the number of months of service at the time of the cancellation. In order to model
customer hazard for the period [1,T], where T is the maximum period of time, for every
customer, i, a target vector $\{h_i(1), \ldots, h_i(t), \ldots, h_i(T)\}$ is created with the following
values (for $1 \le t \le T$):

$$
h_i(t) =
\begin{cases}
0, & 1 \le t \le \mathrm{TENMON} \\
1, & \mathrm{CHURN}=1 \text{ and } \mathrm{TENMON} < t \le T \\
d_t / n_t, & \mathrm{CHURN}=0 \text{ and } \mathrm{TENMON} < t \le T
\end{cases}
$$

In this formula, $d_t$ is the number of cancellations in time interval t; $n_t$ is the number of
customers at risk, i.e., the total number of customers with TENMON $\ge$ t. The ratio $d_t/n_t$ is the
hazard estimate used in the Kaplan-Meier survival estimator for the time interval indexed by t.
The target hazard is set to 0 while a customer is active, to 1 once a customer has cancelled, and to
the Kaplan-Meier hazard if censored, meaning the period of time, beyond the observed tenure of a
still-active customer, for which the probability of churn is sought.

The vector $\{h_i(t)\}$ can be thought of as a raw hazard function for the i-th individual that
the neural net will relate to the underlying covariates.
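
As a non-limiting sketch, the construction of the target vector might be expressed in Python as follows. The function names and the (TENMON, CHURN) tuple layout are hypothetical. The sketch also assumes that cancellations in interval t are the customers with CHURN=1 and TENMON equal to t, that customers at risk are those with TENMON of at least t, and that the hazard is taken as 0 when nobody is at risk; none of these details is fixed by the description above.

```python
def kaplan_meier_hazard(customers, T):
    """Return {t: d_t / n_t} for t = 1..T.

    customers: list of (tenmon, churn) pairs (illustrative layout).
    d_t: customers with churn == 1 and tenmon == t (assumption).
    n_t: customers with tenmon >= t, i.e. still at risk at t (assumption).
    """
    hazard = {}
    for t in range(1, T + 1):
        n_t = sum(1 for tenmon, _ in customers if tenmon >= t)
        d_t = sum(1 for tenmon, churn in customers if churn == 1 and tenmon == t)
        hazard[t] = d_t / n_t if n_t else 0.0  # 0.0 when nobody is at risk (assumption)
    return hazard


def target_vector(tenmon, churn, km_hazard, T):
    """Build the raw hazard targets h_i(1..T) for one customer."""
    target = []
    for t in range(1, T + 1):
        if t <= tenmon:
            target.append(0.0)            # customer observed active
        elif churn == 1:
            target.append(1.0)            # customer has cancelled
        else:
            target.append(km_hazard[t])   # censored: Kaplan-Meier hazard
    return target


customers = [(1, 1), (1, 0), (2, 1), (3, 1)]
km = kaplan_meier_hazard(customers, T=3)
censored_target = target_vector(1, 0, km, T=3)
cancelled_target = target_vector(2, 1, km, T=3)
```

For the four hypothetical customers above, the customer censored after one month receives the Kaplan-Meier hazards for months 2 and 3, while the customer who cancelled at month 2 receives a target of 1 for month 3.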

The dataset is then split into training, testing and holdout, or validation, datasets. The
training and testing datasets use the target vectors described above and are used to train the neural
network and avoid overfitting. The holdout data are used to evaluate performance of the neural 
network. 

Training the Neural Network 

Fig. 2 represents the layout of a neural network architecture implemented in a manner 
consistent with the present invention. Fig. 2a is a flow chart representing the method for training 
the neural network in Fig. 2. This method may be implemented in a computer program stored in 
memory and executed by a central processing unit. The neural network 100 is initially trained to 
generate a hazard function representing an individual customer within the company using data 
from existing customer accounts. The neural network 100 consists of nodes, or artificial
neurons, that receive inputs through "connections" from other nodes that are essentially artificial
resistors. Each such connection has a value known as a weight that is analogous to the resistance 
of a resistor. Each node sums the input signal values received from its inputs after being 
weighted by the connection, and then applies a nonlinear mathematical function to determine a 
value known as the internal activation value of the node. This value is then provided, after 
processing it through an output function, as the output of the unit and then applied, through the 
resistive connections, to units in the next highest layer. For example, the outputs of layer 10 are 
the inputs to layer 20. 

Input layer 10, with its input units, is actually a dummy layer in which the internal 
activation value for each input node is simply set to an analog value provided as input to each 
node. Each input node is connected to the input of every unit in hidden layer 20. Arrows 2, 4 and
6 represent such connections.

Nodes in layers 20 and 30 are called "hidden nodes" because their values are not directly 
observable, unlike the nodes of the input layer 10 and output layer 40. The output of each node 
in hidden layer 20 is connected to the input of every node in the hidden layer 30. The output of 
each node in hidden layer 30 is connected to the input of every node in the output layer 40. 
Connections between each layer are represented by sets of weights 12, 22 and 32 which are
adjusted by a standard optimization routine, for example, steepest descent.

The output of each output unit is provided to the rest of the system as the output of the neural
network 100. In a feedforward network, the flow of information in network 100 is in one
direction only, from input layer 10 to output layer 40. When information is applied to the input
of network 100, it propagates to hidden layer 20, then hidden layer 30, and finally to output layer
40. The value of each output node represents the probability of churn for an individual customer,
as a function of his tenure with the company.

Input layer 10 consists of multiple nodes $x_1, x_2, x_3, \ldots, x_r$, and each node is assigned a
value representing attributes from a customer's account (step 60). For cellular telephone service,
these attributes may include, but are not limited to, minutes of use for local, toll, peak and
off-peak calls, total billing, detailed billing, previous balance, charges for access, toll, roaming and
optional features, total number of calls, number of months in service, rate plan, contract type,
date and duration, current and historical profitability and optional features. A set of weights 12
is assigned to the connections between the input layer and the first hidden layer. The weight
assigned to each connection is then applied as data from each node $x_1, x_2, \ldots, x_r$ passes through
the connection to each node $H_{1,1}, H_{1,2}, \ldots, H_{1,25}$ within the first hidden layer 20 (step 62). These
values are then input into the first hidden layer 20 of the neural network 100 and, at each node of
the first hidden layer 20, all inputs are then summed together to form an internal activation value
($v$) of a node (step 64). These values are then passed through a logistic activation function
$\varphi(v) = 1/(1 + e^{-v})$ to transform the internal activation of a node to its output activation (step 66).
The logistic activation function for output nodes also ensures that the predicted hazard rates are
between 0 and 1. The output from the first hidden layer 20 is then sent to the second hidden
layer 30 (step 72). The same procedure is followed as the data flows through the input, hidden
and output layers of the neural network 100. As the output layer 40 is the final layer in the neural
network 100 (step 72), a predicted model hazard function is output from the neural network
(step 74). If, however, a neural network includes additional layers, processing returns to step 62.
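
The layer-by-layer propagation described above (weighting, summation and logistic activation) might be sketched in Python as follows. This is a minimal illustration only: the names `logistic` and `forward` and the nested-list weight layout are hypothetical, and bias terms are omitted because the description does not mention them.

```python
import math


def logistic(v):
    """Logistic activation 1 / (1 + e^(-v)); output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(inputs, weight_sets):
    """Propagate input attribute values through successive layers.

    weight_sets: one matrix per set of connections (e.g. weights 12, 22, 32);
    weight_sets[n][j][i] is the weight on the connection from node i of one
    layer to node j of the next (illustrative layout, an assumption).
    """
    activations = list(inputs)
    for weights in weight_sets:
        # Weight each incoming value, sum at each node, then apply the logistic function.
        activations = [
            logistic(sum(w * a for w, a in zip(row, activations)))
            for row in weights
        ]
    return activations


# Two inputs, one hidden layer of two nodes, one output node.
outputs = forward([0.0, 0.0], [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0]]])
```

With all-zero inputs every weighted sum is 0, so each node outputs 0.5 in this toy case.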

A comparison between the predicted and actual output is made by applying a relative entropy or
cross entropy error function

$$
E = \sum_i f_i \sum_k 2 \left\{ y_{ik} \ln\frac{y_{ik}}{\mu_{ik}} + (1 - y_{ik}) \ln\frac{1 - y_{ik}}{1 - \mu_{ik}} \right\}
$$

(step 76). The system then determines if the collective difference between the predicted output and
the actual output is minimized (step 76). If the difference is minimized, then the training is
complete. If the difference is not minimized, the result from the relative entropy or cross entropy
error function is used to generate the error value based on which the weights 12, 22 and 32 in the
neural network 100 are adjusted (step 80). The variable $\mu_{ik}$ represents the predicted output value,
or posterior probability, for the k-th unit of the i-th input case. The variable $y_{ik}$ is the target value,
or actual value, for the k-th unit of the i-th case. The variable $f_i$ is the frequency of the i-th case.
The process then returns to step 60 for the processing of another customer.

When the predicted and actual outputs are sufficiently close, the training of the neural
network 100 is completed. No further adjustments to the set of weights 12, 22 and 32 are made.
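
For illustration, a relative entropy error of the kind described above might be computed as in the following Python sketch. The function name and argument layout are hypothetical. The constant factor of 2 is an assumption here; it only scales the error and does not change where the minimum lies. Terms of the form 0 ln 0 are taken as 0, and predictions are assumed to lie strictly between 0 and 1, as logistic outputs do.

```python
import math


def _entropy_term(y, mu):
    """y * ln(y / mu), taking 0 * ln(0) as 0."""
    return 0.0 if y == 0.0 else y * math.log(y / mu)


def relative_entropy_error(targets, predictions, frequencies=None):
    """Sum over cases i and output units k of the relative entropy between
    target y_ik and prediction mu_ik, weighted by case frequency f_i.

    targets, predictions: lists of per-case lists (illustrative layout).
    frequencies: per-case weights f_i; defaults to 1 for every case.
    """
    if frequencies is None:
        frequencies = [1.0] * len(targets)
    total = 0.0
    for f_i, y_i, mu_i in zip(frequencies, targets, predictions):
        for y, mu in zip(y_i, mu_i):
            # Factor 2.0 is an assumed constant scaling (see lead-in).
            total += f_i * 2.0 * (_entropy_term(y, mu) + _entropy_term(1.0 - y, 1.0 - mu))
    return total


error = relative_entropy_error([[1.0, 0.0]], [[0.5, 0.5]])
```

The error is zero when every prediction equals its target and grows as predictions move away from the targets, which is the quantity the weight adjustment seeks to reduce.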

The neural network 100 can be formatted with any number of attributes relating to
customer account information within the nodes $x_1, x_2, \ldots, x_r$ of the input layer 10 of the neural
network 100. Further, the number of output nodes $\hat{h}_1, \hat{h}_2, \ldots, \hat{h}_{36}$ can have a different value,
as it represents the number of months for which the hazard function is plotted.

Once the neural network has been trained and a hazard function model is generated, the
network is then used to calculate a hazard function for each individual customer. For every
customer, i, the neural network outputs a predicted hazard function $\{\hat{h}_i(t)\}$, $1 \le t \le T$. As such, the
output of the neural network model consists of a predicted tenure for each individual customer,
as well as that customer's estimated hazard function.
Generating Hazard Functions 

After the training of the neural network 100 is completed, the neural network is then 
capable of generating hazard functions for existing customers. Fig. 3 is a flowchart representing 
the method used for generating a hazard function for a customer. Using the same set of attributes 
that were utilized in the training of the neural network 100, the account data for a particular 
customer is entered into the neural network and compared with the hazard function model by 
way of the adjusted set of weights 12, 22, and 32. In the same manner the attributes were loaded 
into the input layer 10 of the neural network 100, the analogous set of attributes are input into the 
input layer 10 (step 300). The set of weights 12 is applied to each connection between each node 
of the input layer 10 and the first hidden layer 20 (step 302). The values are then summed at each 
node of the first hidden layer 20 (step 304). The logistic activation function $\varphi(v) = 1/(1 + e^{-v})$ is
then performed at each node of the first hidden layer 20 (step 306). These values are then passed
to each node of the second hidden layer 30 (step 308). 

The set of weights 22 is then applied to each connection between each node of the first
hidden layer 20 and the second hidden layer 30 (step 302). The values are then summed at each
node of the second hidden layer 30 (step 304) and passed through the logistic activation function
$\varphi(v) = 1/(1 + e^{-v})$ (step 306). These values are then passed to each node of the output layer 40
(step 308). The set of weights 32 is then applied to each connection between each node of the
second hidden layer 30 and the output layer 40 (step 302). The values are then summed at each
node (step 304) and passed through the logistic activation function $\varphi(v) = 1/(1 + e^{-v})$ (step 306).
As the output layer 40 is the last layer of the neural network 100 (step 308), each node at the
output layer 40 then outputs a value between 0 and 1 representing the probability of churn for
that particular customer at a time t. When each of these values from the nodes in the output layer
40 are graphed, a hazard function is then generated and can be analyzed in order to determine
what marketing techniques should be applied in order to maintain the customer account (step
310).

EXPRESS MAIL NO. EK895748075US PATENT
DOCKET NO. 99-836

Clustering 

Through the statistical clustering of the individual hazard functions discerned by the
neural network 100, a number of basic patterns are generated based on the overall shape of the
hazard functions. Within each of the basic patterns, hazard functions of individual customers are
multiples of the reference hazard. That is, each individual's complete hazard over the whole time
period calculated is a multiple of a single reference hazard.

The clustering of the hazard functions is performed using k-means clustering. K-means
clustering is a nonhierarchical method of clustering which initially takes a number of
components of the population equal to the final required number of clusters. In this step, the
initial components are chosen such that the points are mutually farthest apart. Next,
it examines each remaining component in the population and assigns it to the cluster to which its
distance is minimum. The centroid's position is recalculated every time a component is added to
the cluster and this continues until all the components are grouped into the final required number
of clusters. It can be appreciated that other methods of clustering can be used to generate clusters
of hazard functions.
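
The clustering procedure described above might be sketched in Python as follows, purely as an illustration. The function names are hypothetical. The sketch assumes a greedy farthest-first choice of initial components, starting from the first point, as one way of selecting points that are mutually far apart; the description does not specify how that selection is made.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_seeds(points, k):
    """Pick k seed indices greedily so that seeds are far apart (assumed heuristic)."""
    seeds = [0]  # starting from the first point is an assumption
    while len(seeds) < k:
        best = max(
            (i for i in range(len(points)) if i not in seeds),
            key=lambda i: min(distance(points[i], points[s]) for s in seeds),
        )
        seeds.append(best)
    return seeds


def sequential_k_means(points, k):
    """Assign each remaining point to its nearest cluster, updating the
    centroid every time a point is added."""
    seeds = farthest_first_seeds(points, k)
    centroids = [list(points[s]) for s in seeds]
    counts = [1] * k
    labels = [None] * len(points)
    for cluster, s in enumerate(seeds):
        labels[s] = cluster
    for i, p in enumerate(points):
        if labels[i] is not None:
            continue
        c = min(range(k), key=lambda j: distance(p, centroids[j]))
        counts[c] += 1
        # Running-mean update of the centroid after adding point p.
        centroids[c] = [m + (x - m) / counts[c] for m, x in zip(centroids[c], p)]
        labels[i] = c
    return labels, centroids


labels, centroids = sequential_k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```

In practice each point would be a vector of shape statistics for one customer's hazard function rather than the two-dimensional toy points used here.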

K-means clustering can be performed by constructing a small number of statistics to
indicate the shape of each hazard function for each customer. These may include the overall
slope of the hazard function from 1 to 36 months; the relative size of any spike at the contract
expiration time of 12 months, 24 months and so on; the terminal slope of the hazard curve; the
average hazard rate, defined as the mean of $[h_i(1), \ldots, h_i(T)]$, where i represents an individual
customer, t represents a particular point in time, and $h_i(t)$ represents the hazard value; the average
hazard rate for the pre-contract expiration period (with 12 month contracts); the overall slope of
the hazard curve calculated as $(h_i(t) - h_i(1))/T$; or the initial slope of the hazard curve, calculated
as a difference quotient of the same form over an initial portion of the period. These and other
parameters can be used to define the process for clustering hazard functions.
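
A few of the shape statistics listed above might be computed as in the following Python sketch, which is offered only as an example. The function name and dictionary keys are hypothetical. Two details are assumptions rather than definitions from the description: the overall slope takes its end point at t = T, and the relative spike size is measured as the hazard at the contract month divided by the mean of its two neighboring months.

```python
def shape_statistics(hazard, contract_month=12):
    """Summarize the shape of one hazard function.

    hazard[k] holds h_i(t) for t = k + 1; requires len(hazard) > contract_month.
    """
    T = len(hazard)

    def h(t):
        return hazard[t - 1]

    neighbors = (h(contract_month - 1) + h(contract_month + 1)) / 2.0
    return {
        # Mean of h_i(1), ..., h_i(T).
        "average_hazard": sum(hazard) / T,
        # Mean hazard over the months before contract expiration.
        "pre_expiration_average": sum(hazard[: contract_month - 1]) / (contract_month - 1),
        # Overall slope with end point t = T (assumption).
        "overall_slope": (h(T) - h(1)) / T,
        # Spike at the contract month relative to adjacent months (assumed measure).
        "relative_spike": h(contract_month) / neighbors if neighbors > 0 else float("inf"),
    }


# Flat hazard of 0.1 with a spike of 0.4 at month 12, over 36 months.
stats = shape_statistics([0.1] * 11 + [0.4] + [0.1] * 24)
```

Vectors of such statistics, one per customer, could then serve as the points passed to the clustering step.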

The clustering process may result in 4 clusters as shown in Fig. 4 where the constituent 
hazard functions are all nearly multiples of each other. As such, the neural network hazard 
functions are effectively four groups of proportional hazard models. 

Other clusters can be utilized to further narrow or broaden the marketing techniques as 
applied to particular customers in the cluster. 
Marketing Techniques for the Hazard Clusters 

The four clusters as shown in Fig. 4 constitute a useful customer segmentation for
marketing techniques. These patterns have an important meaning for marketing and retention
efforts and can be explained within the realm of cellular telephone service. The present
invention can relate to any business and the reference to cellular telephone service is merely used as
an example.

In each cluster, hazard functions are displayed which represent a broad spectrum of 
customers. Cluster 400 displays representative hazard functions for a first customer set. The 
overall shape of the five functions represented in cluster 400 indicates little activity along the
entire function. There are no large spikes either at contract expiration or anywhere along the 
function and the probability of churn (represented on the y-axis) remains substantially the same 
throughout the time period measured (represented on the x-axis). The resultant hazard overall 
shape could be interpreted to represent customers of the "safety and security" set, who possess 
the cellular telephone service as an emergency or a convenience. The attributes of the customers 
in this cluster reveal a tendency to have a detailed bill with only a few calls per month. These 
customers generally tend to sign up for service for the purposes of emergency or safety situations 
and tend not to have high usage. Contacting these customers at contract expiration is not 
necessary as there is no propensity for the customer to terminate the contract after the contract 
period. In fact, contacting the customer at contract expiration is not recommended because it
might alert the customer to look for alternate service providers, potentially resulting in the loss
of the customer to the competition.

Cluster 410 displays representative hazard functions for a second customer set. A large
spike is evident at 12 months indicating a high probability of churn at contract expiration with
the probability of churn remaining elevated in comparison with the first 12 month contract term.
The attributes of the customer set in this cluster comprise users who have a moderate flat-rate
access charge accommodating all their calling needs. These customers tend to have higher usage
than the "safety and security" set of customers and, based on their attributes, tend to fall into
one of two categories: zero charge for minutes of use and many calls per month, or no detailed
bill with a large total charge. Based on the large spike at contract
expiration, retention efforts should be concentrated during the pre-expiration period where a
contract renewal may not be required.

Cluster 420 displays representative hazard functions for a third customer set. The hazard 
functions in cluster 420 evidence high probability of churn at the contract expiration period with 
high and increasing probability of churn post-expiration. The customers in
this cluster tend to have high total charges and have rate plans whose flat rates do not fit their
high calling volumes. These may well be customers who would be better served by a different 
rate plan; their high post-contract churn probabilities indicate that such improved plans are often 
obtained through alternative suppliers. As such, high intensity retention efforts should be made 
pre-expiration of the contract period with continued competitive offers in order to find a plan 
which fits the customer's calling needs. 

Cluster 430 displays representative hazard functions for a fourth customer set. The
overall shape of the functions indicates a scaled down version of cluster 420 with an elevation in
probability of churn at contract expiration with continued elevated probability of churn
post-expiration. These customers may also comprise customers with inappropriate contracts. As
such, moderate pre-expiration retention effort is needed with the need for a new contract or
continued contact in an attempt to find an appropriate plan for the customer's calling needs.

Contract expiration at (usually) 12 months is a crucial event for both the customer and the
company. An organized company could, using the hazard clusters, construct a retention strategy
by contacting and offering significant retention inducements, for example, a new phone or a
better rate plan, to one portion of its customers, while offering lesser, or no, inducements to the
other portions, and ignoring, or not contacting, yet another portion. Each of these strategies,
together with other strategies, can be put in the same economic form; by combining contact costs
and inducement costs into an aggregated concession cost, the issue becomes the size of the
concession to offer to each customer.

Through the neural network, each customer has a unique hazard function and the cluster 
analysis indicates the form and desired outcome of the customer's retention approach. Analysis 
of each customer's hazard function, combined with knowledge of the expected revenue, leads to
an individual estimate of revenue gain to guide the retention effort.
Gain in Lifetime Value 

In addition to evaluating at what point a company should contact a customer regarding
their account, it is important to know the current and future value of the customer in order to
determine the appropriate incentives to offer the customer. The modeling of customer lifetime
value (LTV) has a number of applications including, but not limited to, special services, for
example, premium call centers and elite service, and offers based on customer LTV; targeting
and managing unprofitable customers; segmenting customers; marketing, pricing and promotional
analysis based on LTV; and the sizing and planning for future market opportunity based on
cumulative customer LTV. As such, in order to more accurately represent revenue generated by
a customer, the customer's entire tenure should be taken into account together with any changes
made to the customer's account including change in tenure value and a change in revenue.

The estimate of the expected revenue from a customer is calculated by $ER_i = t_i^* \times R_i$, where
$ER_i$ is the expected revenue, $t_i^*$ is the customer's new expected lifetime, and the monthly
revenue ($R_i$) generated by customer, i, is effectively constant for all t. This quantity can be used
to calculate an expected gain from a successful retention effort, and can, in turn, be used to rank
customers for targeting special marketing techniques.

For example, assume customers are subject to a critical time period in their tenure when a
decision is made concerning continuation of the individual's patronage. This time $t_0$ may be
marked by the expiration of a contract or special promotion. Shortly before this critical time, say
at time $t_0 - \Delta$, an attempt is made to restore the customer's potential future behavior to
what was estimated at the beginning of his/her service period. That is, if this retention effort is
successful, the customer's new hazard function is translated by the interval $t_0 - \Delta$. In
Fig. 5, graph 502 shows an original hazard function based on a contract expiration date of
$t_0 = 12$ months. When $\Delta = 2$ months (i.e., the retention effort is made two months before
contract expiration), the customer's hazard function changes, and that change is reflected in
graph 504.
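
The translation described above can be sketched in a few lines, under stated assumptions. The
sketch assumes a discrete monthly hazard stored as a list (entry k-1 holds the hazard for month
k), assumes that the hazard for months up to the retention effort is left unchanged, and uses a
hypothetical function name and made-up hazard values that are not the values plotted in Fig. 5.

```python
from typing import List


def translate_hazard(hazard: List[float], t0: int, delta: int) -> List[float]:
    """Shift a monthly hazard function forward by (t0 - delta) months.

    Months up to the retention effort keep their original hazard (an
    assumption of this sketch); later months reuse the original hazard as
    if the customer's tenure had restarted at the time of the effort.
    """
    shift = t0 - delta
    if shift < 0 or shift >= len(hazard):
        raise ValueError("t0 - delta must fall within the hazard horizon")
    translated = []
    for month in range(1, len(hazard) + 1):
        if month <= shift:
            translated.append(hazard[month - 1])          # unchanged history
        else:
            translated.append(hazard[month - shift - 1])  # restarted tenure
    return translated


# Hypothetical 24-month hazard with a churn spike at contract expiration (month 12).
original = [0.01] * 11 + [0.20] + [0.03] * 12
renewed = translate_hazard(original, t0=12, delta=2)
```

With a contract expiration at month 12 and a retention effort two months earlier, the spike in
this example moves from month 12 to month 22.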

The new hazard function is called $\{h^{*}(t),\ t = 1, 2, 3, \ldots\}$. This generates the
corresponding survival function $\{S_i^{*}(t),\ t = 1, 2, 3, \ldots\}$. The median remaining
lifetime at time $t_0$ for the new hazard function can be calculated with regard to the original
hazard function as $t^{*} - t_0$, where $t^{*}$ satisfies $S_i^{*}(t^{*}) / S_i(t_0) = 0.5$. Note
that this is not necessarily the same as the time period by which the hazard function is
translated. Then the estimated expected revenue $ER_i^{*}$ is calculated by the formula
$ER_i^{*} = t^{*} R_i$. The gain in lifetime value (GLTV) for customer i from a successful
retention effort is $GLTV_i = ER_i^{*} - ER_i(0)$. This quantity does not necessarily measure
the actual gain in revenue that would be achieved through a successful customer retention effort. 
The customer's future revenue may be different from that observed in the past, particularly if the 
retention tactic involved the customer switching to a new rate plan. Further, it is not necessarily 
the case that a successful retention will result in a complete recapitulation of the original hazard 
function. The suggested GLTV calculation should be taken as a measure of the relative worth of 
a customer when information about their post-retention hazard is not available. 
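
A minimal sketch of the calculation, under several assumptions, is given below. It assumes a
discrete monthly hazard, builds the survival function as the running product of (1 - hazard),
takes the first month at which the survival ratio falls to 0.5 or below (an exact value of 0.5 is
rarely reached on a monthly grid), and takes the baseline expected revenue to be the same
calculation applied to the original hazard. The baseline choice, the function names, and the
sample values are illustrative assumptions rather than details stated in the text.

```python
from typing import List, Optional


def survival(hazard: List[float]) -> List[float]:
    """Survival curve: index 0 holds S(0) = 1, index t holds S(t) after month t."""
    curve = [1.0]
    for h in hazard:
        curve.append(curve[-1] * (1.0 - h))
    return curve


def median_time(curve: List[float], reference: float, t0: int) -> Optional[int]:
    """First month t >= t0 with curve[t] / reference <= 0.5, or None if never reached."""
    for t in range(t0, len(curve)):
        if curve[t] / reference <= 0.5:
            return t
    return None


def gain_in_lifetime_value(
    h_orig: List[float], h_new: List[float], t0: int, monthly_revenue: float
) -> Optional[float]:
    """Difference in expected revenue between the new and the original hazard."""
    s_orig, s_new = survival(h_orig), survival(h_new)
    if not 0 <= t0 < len(s_orig):
        raise ValueError("t0 must fall within the hazard horizon")
    reference = s_orig[t0]                          # S(t0) from the original hazard
    t_base = median_time(s_orig, reference, t0)     # assumed baseline lifetime
    t_star = median_time(s_new, reference, t0)      # lifetime after retention
    if t_base is None or t_star is None:
        return None                                 # median lies beyond the horizon
    return (t_star - t_base) * monthly_revenue


# Hypothetical hazards: churn risk starts at month 4 originally, month 7 after retention.
h_orig = [0.0] * 3 + [0.5] * 5
h_new = [0.0] * 6 + [0.5] * 2
example_gain = gain_in_lifetime_value(h_orig, h_new, t0=3, monthly_revenue=50.0)
```

Because the same offset is subtracted from both lifetimes, the gain in this sketch is the same
whether revenue is computed from the median time itself or from the median remaining lifetime.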

The GLTV concept has various operational uses. It serves as a guide for the company's 
interactions with individual customers when their retention may depend on a modified pricing 
plan or concessions. It also becomes the basis for segmenting customers into groups to which 
different retention efforts and concessions might be offered. The term "retention effort" is used 
as a collective term for the variety of company actions that occur throughout a customer's tenure 
aimed at the customer's ultimate retention. In addition to persuasion and negotiation targeted 
near a customer's contract expiration date, operations aimed at retention might include expedited 
customer care, pro-active equipment upgrades and other specially enhanced services. 

With this information, retention efforts are varied by customer segment. The segments 
are generally based on a measure of customer value, so that the highest percentage customer 
segment is subject to one (high) level of retention effort, the next percentage is subject to a 
different (and slightly lower) effort, and so on until the lowest percentage receives yet another 
(very low or nonexistent) retention effort. 
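
One way such value-based segmentation could be sketched is shown below. The tier labels, the
percentage cutoffs, and the customer values are hypothetical; the text does not prescribe
particular cutoffs or a particular measure of customer value.

```python
from typing import Dict, Sequence, Tuple


def assign_retention_tiers(
    values: Dict[str, float], tiers: Sequence[Tuple[str, float]]
) -> Dict[str, str]:
    """Assign each customer a retention tier by rank of customer value.

    `tiers` lists (label, fraction of customers) from the highest-value
    segment to the lowest; the fractions are expected to sum to 1.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    total = len(ranked)
    assignment: Dict[str, str] = {}
    start, cumulative = 0, 0.0
    for label, fraction in tiers:
        cumulative += fraction
        end = min(total, round(cumulative * total))
        for customer_id in ranked[start:end]:
            assignment[customer_id] = label
        start = max(start, end)
    for customer_id in ranked[start:]:
        assignment[customer_id] = tiers[-1][0]  # rounding leftovers go to the last tier
    return assignment


# Hypothetical customer values (for example, a gain in lifetime value per customer).
values = {f"c{i}": 100.0 - 10.0 * i for i in range(10)}
tiers = [("high", 0.1), ("moderate", 0.3), ("minimal", 0.6)]
segments = assign_retention_tiers(values, tiers)
```

In this example the single most valuable customer receives the high-effort tier, the next three
receive a moderate effort, and the remaining six receive a minimal effort.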

Other embodiments of the invention will be apparent to those skilled in the art from
consideration of the specification and practice of the invention disclosed herein. This application
is intended to cover any variations, uses, or adaptations of the invention following the general
principles thereof and including such departures from the present disclosure as come within
known or customary practice in the art to which this invention pertains and as fall within the
limits of the appended claims. It is intended that the specification and examples be considered as
exemplary only, with a true scope and spirit of the invention being indicated by the following
claims.

It will be appreciated that the present invention is not limited to the exact construction 
that has been described above and illustrated in the accompanying drawings, and that various 
modifications and changes can be made without departing from the scope thereof. It is intended 
that the scope of the invention only be limited by the appended claims. 

We claim:

1. A method for evaluating customer value to guide loyalty and retention programs 
comprising the steps of: 

calculating an individual customer's tenure based on attributes relating to a plurality of 
current customer accounts; 

generating a hazard function for each of a plurality of new customers to determine 
probability of churn based on the individual customer's tenure; 

calculating a gain in lifetime value for each of the plurality of new customers; and 

determining a focus for a loyalty and retention program based on at least one of the 
hazard function and gain in lifetime value for each of the plurality of new customers. 

2. The method of claim 1, wherein calculating the gain in lifetime value includes: 
calculating a lifetime value based on contract terms and revenue generated for each of the
plurality of new customers;

calculating the gain in lifetime value by considering a new contract period using the 
formula $ER_i^{*} - ER_i(0) = GLTV$.

3. The method of claim 1, wherein determining a focus for a loyalty and retention includes: 
analyzing the shape of the hazard function generated for each of the plurality of new
customers; and

specifying a set of marketing techniques based on the shape of the hazard function. 

4. The method of claim 1, wherein determining a focus for a loyalty and retention program 
includes: 

specifying a set of incentives offered to the plurality of new customers based on the gain
in lifetime value. 

5. The method of claim 3, wherein specifying the set of marketing techniques based on the 
shape includes: 

determining, based on the shape of the hazard function, there is no effect on churn of a 
contract expiration. 

6. The method of claim 5, wherein specifying the set of marketing techniques includes: 
taking no further steps to deter churn. 

7. The method of claim 3, wherein specifying the set of marketing techniques based on the 
shape includes: 

determining, based on the shape of the hazard function, that there is a small increase in 
probability of churn at contract expiration, with an elevated post-expiration churn. 

8. The method of claim 7, wherein specifying the set of marketing techniques includes: 
having a moderate pre-expiration effort where new contracts or continued contracts are
the goal.

9. The method of claim 3, wherein specifying the set of marketing techniques based on the 
shape includes:

determining, based on the shape of the hazard function, that there is a large spike 
indicating high probability of churn at contract expiration and low probability of churn thereafter. 

10. The method of claim 9, wherein specifying the set of marketing techniques includes: 

concentrating effort on pre-expiration of contract where a contract renewal may not be
required. 

11. The method of claim 3, wherein specifying the set of marketing techniques based on the 
shape includes: 

determining, based on the shape of the hazard function, that there is a large increase in 
probability of churn at expiration with high and increasing post-expiration probability of churn. 

12. The method of claim 11, wherein specifying the set of marketing techniques includes: 
having a high intensity pre-expiration effort with continued competitive offers to maintain
customer.

13. The method of claim 3, wherein specifying the incentives includes: 
determining that value of the set of incentives offered to each of the plurality of new
customers does not exceed the gain in lifetime value.

14. The method of claim 3, wherein analyzing the shape of the hazard function includes: 
clustering all of the hazard functions for each of the plurality of new customers so that
hazard functions with similar shapes can be grouped together.

15. The method of claim 14, wherein analyzing the shape of the hazard function includes: 
determining, based on the overall shape of the clustered hazard functions, what retention
efforts to take to keep a new customer.

16. An apparatus for evaluating customer value to guide loyalty and retention programs 
comprising: 

a calculating module for calculating an individual customer's tenure based on attributes
relating to a plurality of current customer accounts; 

a generating module for generating a hazard function for each of a plurality of new 
customers to determine probability of churn based on the individual customer's tenure; 

a calculating module for calculating a gain in lifetime value for each of the plurality of 
new customers; and 

a determining module for determining a focus for a loyalty and retention program based 
on at least one of the hazard function and gain in lifetime value for each of the plurality of new 
customers. 

17. The apparatus of claim 16, wherein the calculating module for calculating the gain in 
lifetime value includes: 

a calculating module for calculating a lifetime value based on contract terms and revenue 
generated for each of the plurality of new customers; 

a calculating module for calculating the gain in lifetime value by considering a new 
contract period using the formula $ER_i^{*} - ER_i(0) = GLTV$.

18. The apparatus of claim 16, wherein the determining module for determining a focus for a 
loyalty and retention includes: 

an analyzing module for analyzing the shape of the hazard function generated for each of 
the plurality of new customers; and 

a specifying module for specifying a set of marketing techniques based on the shape of 
the hazard function. 

19. The apparatus of claim 16, wherein the determining module for determining a focus for a
loyalty and retention program includes: 

a specifying module for specifying a set of incentives offered to the plurality of new 
customers based on the gain in lifetime value. 

20. The apparatus of claim 18, wherein the specifying module for specifying the set of 
marketing techniques based on the shape includes: 

a determining module for determining, based on the shape of the hazard function, there is 
no effect on churn of a contract expiration. 

21. The apparatus of claim 20, wherein the specifying module for specifying the set of 
marketing techniques includes: 

a taking module for taking no further steps to deter churn. 

22. The apparatus of claim 18, wherein the specifying module for specifying the set of 
marketing techniques based on the shape includes: 

a determining module for determining, based on the shape of the hazard function, that 
there is a small increase in probability of churn at contract expiration, with an elevated post- 
expiration churn. 

23. The apparatus of claim 22, wherein the specifying module for specifying the set of 
marketing techniques includes: 

a having module for having a moderate pre-expiration effort where new contracts or 
continued contracts are the goal. 

24. The apparatus of claim 18, wherein the specifying module for specifying the set of
marketing techniques based on the shape includes:

a determining module for determining, based on the shape of the hazard function, that 
there is a large spike indicating high probability of churn at contract expiration and low 
probability of churn thereafter. 

25. The apparatus of claim 24, wherein the specifying module for specifying the set of 
marketing techniques includes: 

a concentrating module for concentrating effort on pre-expiration of contract where a 
contract renewal may not be required. 

26. The apparatus of claim 18, wherein the specifying module for specifying the set of 
marketing techniques based on the shape includes: 

a determining module for determining, based on the shape of the hazard function, that 
there is a large increase in probability of churn at expiration with high and increasing post- 
expiration probability of churn. 

27. The apparatus of claim 26, wherein the specifying module for specifying the set of 
marketing techniques includes: 

a having module for having a high intensity pre-expiration effort with continued 
competitive offers to maintain customer. 

28. The apparatus of claim 18, wherein the specifying module for specifying the incentives 
includes: 

a determining module for determining that value of the set of incentives offered to each of
the plurality of new customers does not exceed the gain in lifetime value. 

29. The apparatus of claim 18, wherein the analyzing module for analyzing the shape of the 
hazard function includes: 

a clustering module for clustering all of the hazard functions for each of the plurality of 
new customers so that hazard functions with similar shapes can be grouped together. 

30. The apparatus of claim 29, wherein the analyzing module for analyzing the shape of the 
hazard function includes: 

a determining module for determining, based on the overall shape of the clustered hazard 
functions, what retention efforts to take to keep a new customer. 

31. A computer-readable medium containing instructions for evaluating customer value to 
guide loyalty and retention programs comprising: 

calculating an individual customer's tenure based on attributes relating to a plurality of 
current customer accounts; 

generating a hazard function for each of a plurality of new customers to determine 
probability of churn based on the individual customer's tenure; 

calculating a gain in lifetime value for each of the plurality of new customers; and 

determining a focus for a loyalty and retention program based on at least one of the 
hazard function and gain in lifetime value for each of the plurality of new customers. 

32. The computer-readable medium of claim 31, wherein calculating the gain in lifetime 
value includes: 

calculating a lifetime value based on contract terms and revenue generated for each of the
plurality of new customers; and 

calculating the gain in lifetime value by considering a new contract period using the 
formula $ER_i^{*} - ER_i(0) = GLTV$.

33. The computer-readable medium of claim 31, wherein determining a focus for a loyalty 
and retention includes: 

analyzing the shape of the hazard function generated for each of the plurality of new 
customers; and 

specifying a set of marketing techniques based on the shape of the hazard function. 

34. The computer-readable medium of claim 31, wherein determining a focus for a loyalty 
and retention program includes: 

specifying a set of incentives offered to the plurality of new customers based on the gain 
in lifetime value. 

35. The computer-readable medium of claim 33, wherein specifying the set of marketing 
techniques based on the shape includes: 

determining, based on the shape of the hazard function, there is no effect on churn of a 
contract expiration. 

36. The computer-readable medium of claim 35, wherein specifying the set of marketing 
techniques includes: 

taking no further steps to deter churn. 

37. The computer-readable medium of claim 33, wherein specifying the set of marketing
techniques based on the shape includes: 

determining, based on the shape of the hazard function, that there is a small increase in 
probability of churn at contract expiration, with an elevated post-expiration churn. 

38. The computer-readable medium of claim 37, wherein specifying the set of marketing 
techniques includes: 

having a moderate pre-expiration effort where new contracts or continued contracts are 
the goal. 

39. The computer-readable medium of claim 33, wherein specifying the set of marketing 
techniques based on the shape includes:

determining, based on the shape of the hazard function, that there is a large spike 
indicating high probability of churn at contract expiration and low probability of churn thereafter. 

40. The computer-readable medium of claim 39, wherein specifying the set of marketing 
techniques includes: 

concentrating effort on pre-expiration of contract where a contract renewal may not be 
required. 

41. The computer-readable medium of claim 33, wherein specifying the set of marketing 
techniques based on the shape includes: 

determining, based on the shape of the hazard function, that there is a large increase in 
probability of churn at expiration with high and increasing post-expiration probability of churn. 

42. The computer-readable medium of claim 41, wherein specifying the set of marketing
techniques includes: 

having a high intensity pre-expiration effort with continued competitive offers to maintain 
customer. 

43. The computer-readable medium of claim 33, wherein specifying the incentives includes: 
determining that value of the set of incentives offered to each of the plurality of new
customers does not exceed the gain in lifetime value.

44. The computer-readable medium of claim 33, wherein analyzing the shape of the hazard 
function includes: 

clustering all of the hazard functions for each of the plurality of new customers so that 
hazard functions with similar shapes can be grouped together. 

45. The computer-readable medium of claim 44, wherein analyzing the shape of the hazard 
function includes: 

determining, based on the overall shape of the clustered hazard functions, what retention 
efforts to take to keep a new customer. 

46. A system for evaluating customer value to guide loyalty and retention programs 
comprising: 

means for calculating an individual customer's tenure based on attributes relating to a 
plurality of current customer accounts; 

means for generating a hazard function for each of a plurality of new customers to 
determine probability of churn based on the individual customer's tenure; 

means for calculating a gain in lifetime value for each of the plurality of new customers;
and

means for determining a focus for a loyalty and retention program based on at least one of 
the hazard function and gain in lifetime value for each of the plurality of new customers. 

47. The system of claim 46, wherein means for calculating the gain in lifetime value includes: 
means for calculating a lifetime value based on contract terms and revenue generated for
each of the plurality of new customers; and

means for calculating the gain in lifetime value by considering a new contract period 
using the formula $ER_i^{*} - ER_i(0) = GLTV$.

48. The system of claim 46, wherein means for determining a focus for a loyalty and retention 
includes: 

means for analyzing the shape of the hazard function generated for each of the plurality of 
new customers; and 

means for specifying a set of marketing techniques based on the shape of the hazard 
function. 

49. The system of claim 46, wherein means for determining a focus for a loyalty and retention 
program includes: 

means for specifying a set of incentives offered to the plurality of new customers based 
on the gain in lifetime value. 

50. The system of claim 48, wherein means for specifying the set of marketing techniques
based on the shape includes: 

means for determining, based on the shape of the hazard function, there is no effect on
churn of a contract expiration. 

51. The system of claim 50, wherein means for specifying the set of marketing techniques
includes: 

means for taking no further steps to deter churn. 

52. The system of claim 48, wherein means for specifying the set of marketing techniques 
based on the shape includes: 

means for determining, based on the shape of the hazard function, that there is a small 
increase in probability of churn at contract expiration, with an elevated post-expiration churn. 

53. The system of claim 52, wherein means for specifying the set of marketing techniques 
includes: 

means for having a moderate pre-expiration effort where new contracts or continued 
contracts are the goal. 

54. The system of claim 48, wherein means for specifying the set of marketing techniques 
based on the shape includes:

means for determining, based on the shape of the hazard function, that there is a large 
spike indicating high probability of churn at contract expiration and low probability of churn 
thereafter. 

55. The system of claim 54, wherein means for specifying the set of marketing techniques 
includes: 

means for concentrating effort on pre-expiration of contract where a contract renewal may
not be required. 

56. The system of claim 48, wherein means for specifying the set of marketing techniques 
based on the shape includes: 

means for determining, based on the shape of the hazard function, that there is a large 
increase in probability of churn at expiration with high and increasing post-expiration probability 
of churn. 

57. The system of claim 56, wherein means for specifying the set of marketing techniques 
includes: 

means for having a high intensity pre-expiration effort with continued competitive offers 
to maintain customer. 

58. The system of claim 48, wherein means for specifying the incentives includes: 

means for determining that value of the set of incentives offered to each of the plurality of 
new customers does not exceed the gain in lifetime value. 

59. The system of claim 48, wherein means for analyzing the shape of the hazard function 
includes: 

means for clustering all of the hazard functions for each of the plurality of new customers 
so that hazard functions with similar shapes can be grouped together. 

60. The system of claim 59, wherein means for analyzing the shape of the hazard function 
includes: 

means for determining, based on the overall shape of the clustered hazard functions, what
retention efforts to take to keep a new customer. 

A method and apparatus for training a neural network to compute hazard functions for
customers and analyzing hazard functions, both for an individual customer and for a set of
customers, to focus marketing techniques. The hazard function represents the likelihood of churn
for a particular customer. The gain in lifetime value is also calculated for each customer, which
incorporates the present value of the customer with the future value of the customer if a new
contract is entered. The overall shape of the hazard function, combined with the gain in lifetime
value, specifies what marketing techniques are to be applied together with what additional
incentives are to be offered to the customer in order to prevent churn.